Recurrent Residual Module for Fast Inference in Videos

نویسندگان

Bowen Pan

Wuwei Lin

Xiaolin Fang

Chaoqin Huang

Bolei Zhou

Cewu Lu

چکیده

Deep convolutional neural networks (CNNs) have made impressive progress in many video recognition tasks such as video pose estimation and video object detection. However, CNN inference on video is computationally expensive due to processing dense frames individually. In this work, we propose a framework called Recurrent Residual Module (RRM) to accelerate the CNN inference for video recognition tasks. This framework has a novel design of using the similarity of the intermediate feature maps of two consecutive frames, to largely reduce the redundant computation. One unique property of the proposed method compared to previous work is that feature maps of each frame are precisely computed. The experiments show that, while maintaining the similar recognition performance, our RRM yields averagely 2× acceleration on the commonly used CNNs such as AlexNet, ResNet, deep compression model (thus 8−12× faster than the original dense models using the efficient inference engine), and impressively 9× acceleration on some binary networks such as XNOR-Nets (thus 500× faster than the original model). We further verify the effectiveness of the RRM on speeding up CNNs for video pose estimation and video object detection.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Recurrent Residual Learning for Action Recognition

Action recognition is a fundamental problem in computer vision with a lot of potential applications such as video surveillance, human computer interaction, and robot learning. Given pre-segmented videos, the task is to recognize actions happening within videos. Historically, hand crafted video features were used to address the task of action recognition. With the success of Deep ConvNets as an ...

متن کامل

Fast Intra Mode Decision for Depth Map coding in 3D-HEVC Standard

three dimensional- high efficiency video coding (3D-HEVC) is the expanded version of the latest video compression standard, namely high efficiency video coding (HEVC), which is used to compress 3D videos. 3D videos include texture video and depth map. Since the statistical characteristics of depth maps are different from those of texture videos, new tools have been added to the HEVC standard fo...

متن کامل

Discovering Discrete Latent Topics with Neural Variational Inference

Topic models have been widely explored as probabilistic generative models of documents. Traditional inference methods have sought closedform derivations for updating the models, however as the expressiveness of these models grows, so does the difficulty of performing fast and accurate inference over their parameters. This paper presents alternative neural approaches to topic modelling by provid...

متن کامل

The Recurrent Temporal Restricted Boltzmann Machine

The Temporal Restricted Boltzmann Machine (TRBM) is a probabilistic model for sequences that is able to successfully model (i.e., generate nice-looking samples of) several very high dimensional sequences, such as motion capture data and the pixels of low resolution videos of balls bouncing in a box. The major disadvantage of the TRBM is that exact inference is extremely hard, since even computi...

متن کامل

An Efficient Cluster Head Selection Algorithm for Wireless Sensor Networks Using Fuzzy Inference Systems

An efficient cluster head selection algorithm in wireless sensor networks is proposed in this paper. The implementation of the proposed algorithm can improve energy which allows the structured representation of a network topology. According to the residual energy, number of the neighbors, and the centrality of each node, the algorithm uses Fuzzy Inference Systems to select cluster head. The alg...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

CoRR

دوره abs/1802.09723 شماره

صفحات -

تاریخ انتشار 2018

Recurrent Residual Module for Fast Inference in Videos

نویسندگان

چکیده

منابع مشابه

Recurrent Residual Learning for Action Recognition

Fast Intra Mode Decision for Depth Map coding in 3D-HEVC Standard

Discovering Discrete Latent Topics with Neural Variational Inference

The Recurrent Temporal Restricted Boltzmann Machine

An Efficient Cluster Head Selection Algorithm for Wireless Sensor Networks Using Fuzzy Inference Systems

عنوان ژورنال:

اشتراک گذاری